Optimal Sub-sampling with Influence Functions

Authors

  • Daniel Ting
  • Eric Brochu
Abstract

Sub-sampling is a common and often effective way to deal with the computational challenges of large datasets. However, for most statistical models, there is no well-motivated approach for drawing a non-uniform subsample. We show that the concept of an asymptotically linear estimator and the associated influence function leads to optimal sampling procedures for a wide class of popular models. Furthermore, for linear regression models, which have well-studied procedures for non-uniform sub-sampling, we show that our optimal influence-function-based method outperforms previous approaches. We empirically show the improved performance of our method on real datasets.
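As a rough illustration of the idea in the abstract, the sketch below subsamples a linear regression problem with inclusion probabilities proportional to the norm of each point's influence on the least-squares estimate, then refits with inverse-probability weights. This is not taken from the paper: the function name `influence_subsample_ols`, the use of a full-data pilot fit, the Poisson (independent coin-flip) sampling scheme, and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def influence_subsample_ols(X, y, m, rng):
    """Hypothetical sketch: influence-based Poisson subsampling for OLS.

    m is the target expected subsample size (an assumption of this sketch).
    """
    n, d = X.shape
    # Pilot estimate. Assumption: fit on the full data for simplicity;
    # a small uniform pilot sample could be used instead.
    beta_pilot, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta_pilot
    # Empirical second-moment matrix of the covariates.
    H = X.T @ X / n
    # Influence of point i on the OLS estimate: H^{-1} x_i * residual_i.
    infl = np.linalg.solve(H, X.T).T * resid[:, None]
    norms = np.linalg.norm(infl, axis=1)
    # Inclusion probabilities proportional to the influence norm, capped at 1.
    probs = np.minimum(1.0, m * norms / norms.sum())
    keep = rng.random(n) < probs
    # Inverse-probability weights keep the weighted estimating equation unbiased.
    sw = np.sqrt(1.0 / probs[keep])
    beta_sub, *_ = np.linalg.lstsq(X[keep] * sw[:, None], y[keep] * sw, rcond=None)
    return beta_sub, keep, probs

# Usage on synthetic data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 3))
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true + rng.normal(size=5000)
beta_hat, keep, probs = influence_subsample_ols(X, y, m=500, rng=rng)
```

Because the probabilities are capped at 1, the expected subsample size is at most `m`; points with near-zero residual under the pilot fit are rarely selected, which is the intended effect of weighting by influence.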

Similar resources

THE EFFECTS OF INITIAL SAMPLING AND PENALTY FUNCTIONS IN OPTIMAL DESIGN OF TRUSSES USING METAHEURISTIC ALGORITHMS

Although the genetic algorithm (GA), ant colony (AC) and particle swarm optimization (PSO) algorithms have already been extended to various types of engineering problems, the effect of initial sampling, alongside constraints, on the efficiency of these algorithms is still an interesting field. In this paper we show that initial sampling with a special series of constraints plays an important role in the conv...

Optimal sub-Nyquist nonuniform sampling and reconstruction for multiband signals

We study the problem of optimal sub-Nyquist sampling for perfect reconstruction of multiband signals. The signals are assumed to have a known spectral support that does not tile under translation. Such signals admit perfect reconstruction from periodic nonuniform sampling at rates approaching Landau’s lower bound, equal to the measure of the spectral support. For signals with sparse spectral support, this rate can be much smaller ...

Sampling – Reconstruction Procedure of Gaussian Fields Procedimiento para el Muestreo y Reconstrucción de Campos Gausianos

The description of the optimal Sampling – Reconstruction Procedure (SRP) of Gaussian fields is given on the basis of the conditional mean rule when the quantity of samples is limited. The Gaussian fields are described by two types of space covariance function: exponential and Gaussian. A lot of both reconstruction and reconstruction error surfaces are obtained by numerical calculation. We chang...

Particle swarm optimization for a bi-objective web-based convergent product networks

Here, a collection of base functions and sub-functions configures the nodes of a web-based (digital) network representing functionalities. Each arc in the network is to be assigned as the link between two nodes. The aim is to find an optimal tree of functionalities in the network, adding value to the product in the web environment. First, a purification process is performed in the product network ...

Convergence rates of sub-sampled Newton methods

We consider the problem of minimizing a sum of n functions via projected iterations onto a convex parameter set C ⊂ R^p, where n ≫ p ≫ 1. In this regime, algorithms which utilize sub-sampling techniques are known to be effective. In this paper, we use sub-sampling techniques together with low-rank approximation to design a new randomized batch algorithm which possesses comparable convergence rate to ...

Journal:
  • CoRR

Volume abs/1709.01716

Publication date 2017